Skip to content

[SYCL] Implement atomic_memory_scope_capabilities device query for OpenCL and Level Zero #8595

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 11 commits into from
Mar 22, 2023

Conversation

maarquitos14
Copy link
Contributor

Adds support to query devices for atomic_memory_scope_capabilities. The backends supported are OpenCL and Level Zero. For the rest of backends, it has been left unsupported.

@maarquitos14 maarquitos14 requested review from a team as code owners March 9, 2023 16:11
@maarquitos14 maarquitos14 temporarily deployed to aws March 9, 2023 16:43 — with GitHub Actions Inactive
@maarquitos14 maarquitos14 temporarily deployed to aws March 10, 2023 10:03 — with GitHub Actions Inactive
@maarquitos14 maarquitos14 temporarily deployed to aws March 10, 2023 10:31 — with GitHub Actions Inactive
Comment on lines 311 to 316
// Because scopes are hierarchical, wider scopes support all narrower
// scopes. SUB_GROUP and WORK_ITEM were already included in the
// initialization, since WORK_GROUP is mandated minimum capality.
if (devCapabilities && PI_DEVICE_ATOMIC_SCOPE_DEVICE) {
result |= PI_MEMORY_SCOPE_DEVICE;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I suppose that's the answer to my question above about default value. Do you have a pointer to spec section which describes the hierarchical nature of those scopes?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comes from OpenCL Spec: https://registry.khronos.org/OpenCL/specs/3.0-unified/html/OpenCL_API.html#CL_DEVICE_ATOMIC_MEMORY_CAPABILITIES

However, you can see there that WORK_ITEM is actually an exception in OpenCL. I had an offline discussion with @gmlueck, @bashbaug and @Pennycook about this, and the conclusion is that SYCL should always return WORK_ITEM. I think there will be an update in the spec to detail this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it, thanks! Perhaps it would be better to put that link to OCL spec directly into sources next to the comment

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought about that, but the OCL spec explicitly says WORK_ITEM is an exception, so that might be confusing because we are not treating it as an exception. When SYCL spec is updated, we can obviously add a ref to that next to the comment. For now, I'm not sure if referencing OCL spec is helpful or more confusing. What do you think?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Technically, this code is within OpenCL plugin and therefore I think it shouldn't be too confusing to leave links to OpenCL spec in there.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Discussed this offline. There is a corner case with WORK_ITEM scope being defined differently between SYCL and OpenCL, which is a reason for ignoring this OpenCL exception.

I think that it would still make sense to leave an OCL spec link, but we should have a comment about why we ignore that work-item exception. If we already have a link to SYCL spec clarification PR/issue, then even better, we can link it there as well.

Copy link
Contributor Author

@maarquitos14 maarquitos14 Mar 16, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I extended the current comment with further explanation and a link to OCL spec. Please, let me know if this is enough.

@maarquitos14 maarquitos14 temporarily deployed to aws March 10, 2023 14:36 — with GitHub Actions Inactive
@maarquitos14 maarquitos14 temporarily deployed to aws March 10, 2023 15:16 — with GitHub Actions Inactive
Signed-off-by: Maronas, Marcos <[email protected]>
@maarquitos14 maarquitos14 temporarily deployed to aws March 13, 2023 17:05 — with GitHub Actions Inactive
Signed-off-by: Maronas, Marcos <[email protected]>
@maarquitos14 maarquitos14 temporarily deployed to aws March 13, 2023 18:49 — with GitHub Actions Inactive
@maarquitos14 maarquitos14 temporarily deployed to aws March 13, 2023 19:47 — with GitHub Actions Inactive
Copy link
Contributor

@AlexeySachkov AlexeySachkov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall LGTM

@@ -11,7 +11,7 @@
// Suppress a compiler message about undefined CL_TARGET_OPENCL_VERSION
// and define all symbols up to OpenCL 2.2
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// and define all symbols up to OpenCL 2.2
// and define all symbols up to OpenCL 3.0

Comment on lines 338 to 340
if (paramValueSizeRet)
*paramValueSizeRet = sizeof(result);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if (paramValueSizeRet)
*paramValueSizeRet = sizeof(result);
}
}
if (paramValueSizeRet)
*paramValueSizeRet = sizeof(result);

paramValueSizeRet should be filled even if paramValue is nullptr.
That is needed to support (usually) initial query:

piDeviceInfoGet(device, paramName, nullptr, 0, &size_ret);
// now allocate size_ret bytes and query result

@@ -1787,7 +1842,7 @@ pi_result piextKernelGetNativeHandle(pi_kernel kernel,
// Windows: dynamically loaded plugins might have been unloaded already
// when this is called. Sycl RT holds onto the PI plugin so it can be
// called safely. But this is not transitive. If the PI plugin in turn
// dynamically loaded a different DLL, that may have been unloaded.
// dynamically loaded a different DLL, that may have been unloaded.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like an unrelated change, better to avoid

Comment on lines 311 to 316
// Because scopes are hierarchical, wider scopes support all narrower
// scopes. SUB_GROUP and WORK_ITEM were already included in the
// initialization, since WORK_GROUP is mandated minimum capality.
if (devCapabilities && PI_DEVICE_ATOMIC_SCOPE_DEVICE) {
result |= PI_MEMORY_SCOPE_DEVICE;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Technically, this code is within OpenCL plugin and therefore I think it shouldn't be too confusing to leave links to OpenCL spec in there.

@AlexeySachkov
Copy link
Contributor

AlexeySachkov commented Mar 16, 2023

@intel/dpcpp-esimd-reviewers, could you please take a look at esimd_emulator part of the PR? It is fairly simple

Copy link
Contributor

@sarnex sarnex left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

esimd emu lgtm

Signed-off-by: Maronas, Marcos <[email protected]>
@maarquitos14 maarquitos14 requested a review from a team as a code owner March 16, 2023 16:01
Signed-off-by: Maronas, Marcos <[email protected]>
@maarquitos14 maarquitos14 temporarily deployed to aws March 16, 2023 16:31 — with GitHub Actions Inactive
@maarquitos14 maarquitos14 temporarily deployed to aws March 16, 2023 17:01 — with GitHub Actions Inactive
@maarquitos14
Copy link
Contributor Author

Friendly ping @intel/llvm-reviewers-runtime @intel/dpcpp-l0-pi-reviewers

Comment on lines 1169 to 1178
case UR_DEVICE_INFO_ATOMIC_MEMORY_SCOPE_CAPABILITIES: {
// There are no explicit restrictions in L0 programming guide, so assume all
// are supported
pi_memory_scope_capabilities result =
PI_MEMORY_SCOPE_WORK_ITEM | PI_MEMORY_SCOPE_SUB_GROUP |
PI_MEMORY_SCOPE_WORK_GROUP | PI_MEMORY_SCOPE_DEVICE |
PI_MEMORY_SCOPE_SYSTEM;

return ReturnValue(result);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jandres742 : please review this

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

assumption is correct for L0 GPU driver, based on what we return already in OpenCL GPU driver.

Copy link
Contributor

@smaslov-intel smaslov-intel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Contributor

@jandres742 jandres742 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

@maarquitos14 maarquitos14 temporarily deployed to aws March 17, 2023 12:50 — with GitHub Actions Inactive
@maarquitos14 maarquitos14 temporarily deployed to aws March 17, 2023 13:36 — with GitHub Actions Inactive
Signed-off-by: Maronas, Marcos <[email protected]>
@maarquitos14 maarquitos14 temporarily deployed to aws March 17, 2023 15:57 — with GitHub Actions Inactive
@maarquitos14 maarquitos14 temporarily deployed to aws March 17, 2023 16:26 — with GitHub Actions Inactive
@maarquitos14
Copy link
Contributor Author

@intel/llvm-reviewers-runtime assigned reviewer is on holiday, could you please take a look?

Copy link
Contributor

@steffenlarsen steffenlarsen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! 🚀

Signed-off-by: Maronas, Marcos <[email protected]>
@maarquitos14 maarquitos14 temporarily deployed to aws March 22, 2023 16:32 — with GitHub Actions Inactive
@maarquitos14 maarquitos14 temporarily deployed to aws March 22, 2023 17:52 — with GitHub Actions Inactive
@bader bader changed the title [SYCL] Implements atomic_memory_scope_capabilities device query for OpenCL and Level Zero [SYCL] Implement atomic_memory_scope_capabilities device query for OpenCL and Level Zero Mar 22, 2023
@bader bader merged commit 2816c14 into intel:sycl Mar 22, 2023
@bader
Copy link
Contributor

bader commented Mar 22, 2023

@maarquitos14, sycl/unittests/SYCL2020/AtomicMemoryScopeCapabilities.cpp fails on Windows. Please, fix ASAP.

veselypeta pushed a commit to veselypeta/llvm that referenced this pull request Sep 21, 2023
…enCL and Level Zero (intel#8595)

Adds support to query devices for `atomic_memory_scope_capabilities`.
The backends supported are OpenCL and Level Zero. For the rest of
backends, it has been left unsupported.

---------

Signed-off-by: Maronas, Marcos <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants